Support Multiple language_code Values per Entry for WhisperLiveKit Compatibility #3
Open
Pactortester wants to merge 1 commit into QuentinFuxa:main from …
…patibility

When using WhisperLiveKit with NLLW, the --lan parameter is passed to both Whisper and NLLW. Whisper only accepts "zh" for Chinese while NLLW previously only accepted "zh-CN", causing incompatibility.

Changes:
- Allow language_code field to be a list (e.g., ["zh-CN", "zh"])
- Add helper functions _get_language_codes() and _match_language_code()
- Update dictionary building to map all codes to the same nllb code
- Update lookup functions to support list type language_code
Problem Description
When using WhisperLiveKit with NLLW for voice translation, the --lan parameter is passed to both Whisper (speech recognition) and NLLW (translation). #2
Issue: Whisper only supports "zh" as the Chinese language code, while NLLW previously only supported "zh-CN", causing incompatibility.
Error Reproduction
Root Cause
Whisper expects "zh" for Chinese, while NLLW expected "zh-CN". The two systems have incompatible language codes.
Solution
Extend the language_code field to support a list type, allowing a single language entry to map to multiple language codes (e.g., ["zh-CN", "zh"]).
Modified Files
nllw/languages.py
Changes
1. Data Structure Change
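As an illustration of the data structure change, an entry table might look like the sketch below. Only the language_code field and the ["zh-CN", "zh"] example come from this PR; the table name LANGUAGES and the "name" and "nllb" keys are assumptions made for the example, not the actual layout of nllw/languages.py.

```python
# Hypothetical entry table: the "name" and "nllb" keys are assumed for illustration.
LANGUAGES = [
    # After the change: one entry accepts several codes.
    {"name": "Chinese (Simplified)", "nllb": "zho_Hans", "language_code": ["zh-CN", "zh"]},
    # Entries with a single code can stay as plain strings.
    {"name": "French", "nllb": "fra_Latn", "language_code": "fr"},
]
```

Keeping the single-string form valid means existing entries do not have to be rewritten.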
2. New Helper Functions
3. Modified Dictionary Building Logic
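Mapping every code of an entry to the same nllb code could be sketched as follows. The function name build_code_to_nllb, the sample entries, and the "nllb" key are hypothetical and only serve to show the idea.

```python
# Hypothetical sample entries; the "nllb" key is an assumed field name.
ENTRIES = [
    {"nllb": "zho_Hans", "language_code": ["zh-CN", "zh"]},
    {"nllb": "fra_Latn", "language_code": "fr"},
]


def build_code_to_nllb(entries):
    """Build a dict in which each accepted language code points to its nllb code."""
    mapping = {}
    for entry in entries:
        codes = entry["language_code"]
        if isinstance(codes, str):
            codes = [codes]  # single-code entries are treated as a one-item list
        for code in codes:
            mapping[code] = entry["nllb"]
    return mapping


CODE_TO_NLLB = build_code_to_nllb(ENTRIES)
```

With this mapping, both "zh" and "zh-CN" resolve to the same nllb code, so the value passed by WhisperLiveKit is accepted without a separate duplicate entry.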
4. Modified Lookup Functions
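A lookup that accepts both string and list values might look like the sketch below. The functions find_language_entry() and all_language_codes() are hypothetical stand-ins written for this example, not the library's actual functions, and the entry layout is assumed.

```python
from typing import List, Optional

# Hypothetical sample entries; the "name" key is an assumed field name.
LANGUAGES = [
    {"name": "Chinese (Simplified)", "language_code": ["zh-CN", "zh"]},
    {"name": "French", "language_code": "fr"},
]


def _codes_of(entry: dict) -> List[str]:
    """Return the entry's codes as a list regardless of how they are stored."""
    codes = entry["language_code"]
    return [codes] if isinstance(codes, str) else list(codes)


def find_language_entry(code: str) -> Optional[dict]:
    """Return the first entry that accepts the given code, or None."""
    for entry in LANGUAGES:
        if code in _codes_of(entry):
            return entry
    return None


def all_language_codes() -> List[str]:
    """Flatten every accepted code across all entries, in table order."""
    result: List[str] = []
    for entry in LANGUAGES:
        result.extend(_codes_of(entry))
    return result
```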
The get_language_info() and list_all_language_code_codes() functions have been updated to support a list-type language_code.
Test Verification
After the fix, the following command works correctly:
Impact
- "zh-CN" still works, no changes needed for existing code
- "zh" is now recognized
- … language_code to a list
- … language_code value need no modification
Design Advantages
Compared to simply adding duplicate entries:
This solution is more elegant:
Related Projects